IN THE CLAIMS

Please amend or retain the claims as follows: 

1. (Currently Amended) A method of setting up a DLSI space-based classifier for 
document classification and classifying a document according to a plurality of clusters within a 
database using said classifier comprising the steps of: 

preprocessing documents to distinguish terms of a word and a noun phrase from stop
words;

constructing system terms by setting up a term list as well as global weights;

normalizing document vectors of collected documents, as well as centroid vectors of each
cluster;

constructing a differential term by intra-document matrix $D_I^{m \times n_I}$, such that each column
in said matrix is a differential intra-document vector;

decomposing the differential term by intra-document matrix $D_I$, by an SVD algorithm,
into $D_I = U_I S_I V_I^T$ ($S_I = \mathrm{diag}(\delta_{I,1}, \delta_{I,2}, \ldots)$), followed by a composition of
$D_{I,k_I} = U_{k_I} S_{k_I} V_{k_I}^T$ giving an approximate $D_I$ in terms of an appropriate $k_I$;

setting up a likelihood function of intra-differential document vector; 

constructing a term by extra-document matrix $D_E^{m \times n_E}$, such that each column of said
extra-document matrix is an extra-differential document vector;

decomposing $D_E$, by exploiting the SVD algorithm, into
$D_E = U_E S_E V_E^T$ ($S_E = \mathrm{diag}(\delta_{E,1}, \delta_{E,2}, \ldots)$), then with a proper $k_E$, defining
$D_{E,k_E} = U_{k_E} S_{k_E} V_{k_E}^T$ to approximate $D_E$;

setting up a likelihood function of extra-differential document vector; 

setting up a posteriori function; and 
classifying, on the basis of using the DLSI space-based classifier, the document as
belonging to one of the plurality of clusters within the database to automatically classify a
document.
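
By way of a non-limiting illustration, the normalization and differential-vector construction steps recited in claim 1 can be sketched in Python as follows. The function names (`normalize`, `centroid`, `differential_vectors`) are hypothetical, and pairing every document with the centroid of every other cluster to form the extra-document columns is an assumption made for this sketch. The SVD step is not shown; it would ordinarily be delegated to a linear-algebra library.

```python
import math

def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return list(vec) if norm == 0.0 else [v / norm for v in vec]

def centroid(vectors):
    """Normalized mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return normalize([sum(col) / n for col in zip(*vectors)])

def differential_vectors(docs, labels):
    """Return (intra, extra) lists of differential document vectors.

    docs   : list of weighted term-frequency vectors, one per document
    labels : cluster label of each document, in the same order
    """
    docs = [normalize(d) for d in docs]
    clusters = sorted(set(labels))
    cents = {c: centroid([d for d, l in zip(docs, labels) if l == c])
             for c in clusters}
    intra, extra = [], []
    for d, lab in zip(docs, labels):
        for c in clusters:
            diff = [a - b for a, b in zip(d, cents[c])]
            # Own cluster gives an intra-document column; any other cluster
            # gives an extra-document column (assumption of this sketch).
            (intra if c == lab else extra).append(diff)
    return intra, extra
```

Each returned list holds the columns of the corresponding differential matrix, so the intra list corresponds to $D_I$ and the extra list to $D_E$ before decomposition.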

2. (Currently Amended) An automatic document classification method using a DLSI
space-based classifier for classifying a document in accordance with clusters in a
database, comprising the steps of:

a) setting up a document vector by generating terms as well as frequencies of 
occurrence of said terms in the document, so that a normalized document vector N is obtained 
for the document; 

b) constructing, using the document to be classified, a differential document vector 
x = N - C, where C is the normalized vector giving a center or centroid of a cluster; 

c) calculating an intra-document likelihood function $P(x \mid D_I)$ for the document;

d) calculating an extra-document likelihood function $P(x \mid D_E)$ for the document;

e) calculating a Bayesian posteriori probability function $P(D_I \mid x)$;

f) repeating, for each of the clusters of the database, steps b-e;

g) selecting a cluster having a largest $P(D_I \mid x)$ as the cluster to which the document
most likely belongs; and

h) classifying the document in the selected cluster within said database.
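
As a non-limiting sketch, steps b) through g) of claim 2 can be expressed in Python as below. The names `posterior` and `classify` are hypothetical, the prior $P(D_I) = 1/n_c$ follows the setting recited in claim 7, and `toy_intra` and `toy_extra` are made-up stand-in likelihoods used only to make the example runnable; they are not the likelihood functions of the claims.

```python
import math

def posterior(p_x_intra, p_x_extra, n_clusters):
    """Bayes' rule for P(D_I | x) with prior P(D_I) = 1/n_clusters."""
    prior_i = 1.0 / n_clusters
    prior_e = 1.0 - prior_i
    num = p_x_intra * prior_i
    den = num + p_x_extra * prior_e
    return num / den if den > 0.0 else 0.0

def classify(n_vec, centroids, lik_intra, lik_extra):
    """Return (best_cluster, posteriors) for a normalized document vector.

    centroids : dict mapping cluster name to its normalized centroid C
    lik_intra : callable x -> P(x | D_I)
    lik_extra : callable x -> P(x | D_E)
    """
    scores = {}
    for name, c in centroids.items():                 # step f: every cluster
        x = [a - b for a, b in zip(n_vec, c)]         # step b: x = N - C
        scores[name] = posterior(lik_intra(x),        # steps c, d, e
                                 lik_extra(x),
                                 len(centroids))
    best = max(scores, key=scores.get)                # step g: largest posterior
    return best, scores

# Hypothetical stand-in likelihoods, for illustration only.
def toy_intra(x):
    return math.exp(-sum(v * v for v in x) / 0.02)

def toy_extra(x):
    return math.exp(-(math.sqrt(sum(v * v for v in x)) - 1.2) ** 2 / 0.02)
```

Passing the likelihoods in as callables keeps the selection loop independent of how $P(x \mid D_I)$ and $P(x \mid D_E)$ are actually estimated.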



3. (Original) The method as set forth in claim 2, wherein the normalized document
vector N is obtained using an equation, $b_{ij} = a_{ij} \big/ \sqrt{\sum_{i} a_{ij}^{2}}$.

4. (Currently Amended) A method of setting up a DLSI space-based classifier for 
document classification and classifying a document according to a plurality of clusters within a 
database using said classifier, comprising the steps of:
setting up a differential term by intra-document matrix where each column of the matrix
denotes a difference between a document and a centroid of a cluster to which the document 
belongs; 

decomposing the differential term by intra-document matrix by an SVD algorithm to 
identify an intra-DLSI space; 

setting up a probability function for a differential document vector being a differential 
intra-document vector; 

calculating the probability function according to projection and distance from the 
differential document vector to the intra-DLSI space; 

setting up a differential term by extra-document matrix where each column of the matrix 
denotes a differential document vector between a document vector and a centroid vector of a 
cluster which does not include the document; 

decomposing the differential term by extra-document matrix by an SVD algorithm to 
identify an extra-DLSI space; 

setting up a probability function for a differential document vector being a differential 
extra-document vector; 

setting up a posteriori likelihood function using the differential intra-document and 
differential extra-document vectors to provide a most probable similarity measure of a document 
belonging to a given cluster; and 

classifying, using the DLSI space-based classifier, the document as belonging to one of
the plurality of clusters within the database to automatically classify a document.

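5. (Original) The method as set forth in claim 4, wherein the step of setting up a
probability function for a differential document vector being a differential intra-document
vector is performed using an equation,

$$P(x \mid D_I) = \frac{n_I^{1/2} \exp\left(-\frac{n_I}{2} \sum_{i=1}^{k_I} \frac{y_i^2}{\delta_{I,i}^2}\right) \exp\left(-\frac{n_I \varepsilon^2(x)}{2 \rho_I}\right)}{(2\pi)^{n_I/2} \prod_{i=1}^{k_I} \delta_{I,i} \cdot \rho_I^{(r_I - k_I)/2}},$$

where $y = U_{k_I}^T x$, $\varepsilon^2(x) = \|x\|^2 - \sum_{i=1}^{k_I} y_i^2$, $\rho_I = \frac{1}{r_I - k_I} \sum_{i=k_I+1}^{r_I} \delta_{I,i}^2$, and $r_I$ is the rank of matrix $D_I$.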
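
A likelihood of this kind is more safely evaluated in logarithmic form, since the exponentials underflow quickly for realistic document counts. The following non-limiting Python sketch assumes the likelihood has the form $\log P = \tfrac{1}{2}\log n - \tfrac{n}{2}\sum_i y_i^2/\delta_i^2 - n\varepsilon^2(x)/(2\rho) - \tfrac{n}{2}\log(2\pi) - \sum_i \log\delta_i - \tfrac{r-k}{2}\log\rho$; the function name and argument layout are hypothetical, and the caller is assumed to supply $U_k$, the singular values, $\rho$ and the rank from a prior SVD with $r > k$.

```python
import math

def intra_log_likelihood(x, u_k, deltas, n, rho, rank):
    """Natural log of the assumed likelihood P(x | D_I).

    x      : differential document vector (length m)
    u_k    : list of k orthonormal basis vectors (the columns of U_k)
    deltas : the k leading singular values (all positive)
    n      : number of columns of the differential matrix
    rho    : mean of the squared discarded singular values (positive)
    rank   : rank r of the differential matrix, with r > k
    """
    k = len(deltas)
    # Projection onto the k-dimensional DLSI space: y = U_k^T x.
    y = [sum(ui * xi for ui, xi in zip(u, x)) for u in u_k]
    # Squared distance from x to that space, clamped against rounding error.
    eps2 = max(sum(xi * xi for xi in x) - sum(yi * yi for yi in y), 0.0)
    return (0.5 * math.log(n)
            - 0.5 * n * sum((yi / d) ** 2 for yi, d in zip(y, deltas))
            - n * eps2 / (2.0 * rho)
            - 0.5 * n * math.log(2.0 * math.pi)
            - sum(math.log(d) for d in deltas)
            - 0.5 * (rank - k) * math.log(rho))
```

Comparing log-likelihoods, or exponentiating only their difference, gives the same ranking of clusters as the raw probabilities.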



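6. (Original) The method as set forth in claim 4, wherein the step of setting up a
probability function for a differential document vector being a differential extra-document vector
is performed using an equation,

$$P(x \mid D_E) = \frac{n_E^{1/2} \exp\left(-\frac{n_E}{2} \sum_{i=1}^{k_E} \frac{y_i^2}{\delta_{E,i}^2}\right) \exp\left(-\frac{n_E \varepsilon^2(x)}{2 \rho_E}\right)}{(2\pi)^{n_E/2} \prod_{i=1}^{k_E} \delta_{E,i} \cdot \rho_E^{(r_E - k_E)/2}},$$

where $y = U_{k_E}^T x$, $\varepsilon^2(x) = \|x\|^2 - \sum_{i=1}^{k_E} y_i^2$, $\rho_E = \frac{1}{r_E - k_E} \sum_{i=k_E+1}^{r_E} \delta_{E,i}^2$, and $r_E$ is the rank of matrix $D_E$.

7. (Original) The method as set forth in claim 4, wherein the step of setting up a
posteriori likelihood function is performed using an equation,

$$P(D_I \mid x) = \frac{P(x \mid D_I) P(D_I)}{P(x \mid D_I) P(D_I) + P(x \mid D_E) P(D_E)},$$

where $P(D_I)$ is set to $1/n_c$, where $n_c$ is the number of clusters in the database, and
$P(D_E)$ is set to $1 - P(D_I)$.

8. (Currently Amended) The method as set forth in claim 4, the step of classifying the
document using the DLSI space-based classifier to automatically classify a document comprising
the steps of:
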
a) setting up a document vector by generating terms as well as frequencies of
occurrence of said terms in the document, so that a normalized document vector N is obtained
for the document;

b) constructing, using the document to be classified, a differential document vector
x = N - C, where C is the normalized vector giving a center or centroid of a cluster;

c) calculating an intra-document likelihood function $P(x \mid D_I)$ for the document;

d) calculating an extra-document likelihood function $P(x \mid D_E)$ for the document;

e) calculating a Bayesian posteriori probability function $P(D_I \mid x)$;

f) repeating, for each of the clusters of the database, steps b-e;

g) selecting a cluster having a largest $P(D_I \mid x)$ as the cluster to which the document
most likely belongs; and

h) classifying the document in the selected cluster.

9. (Original) The method as set forth in claim 8, wherein the normalized document 